CODRA: A Novel Discriminative Framework for Rhetorical Analysis

Authors

  • Shafiq R. Joty
  • Giuseppe Carenini
  • Raymond T. Ng
Abstract

Clauses and sentences rarely stand on their own in an actual discourse; rather, the relationship between them carries important information that allows the discourse to express a meaning as a whole beyond the sum of its individual parts. Rhetorical analysis seeks to uncover this coherence structure. In this article, we present CODRA, a COmplete probabilistic Discriminative framework for performing Rhetorical Analysis in accordance with Rhetorical Structure Theory, which posits a tree representation of a discourse. CODRA comprises a discourse segmenter and a discourse parser. First, the discourse segmenter, which is based on a binary classifier, identifies the elementary discourse units in a given text. Then the discourse parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intra-sentential parsing and the other for multi-sentential parsing. We present two approaches to combine these two stages of parsing effectively. By conducting a series of empirical evaluations over two different data sets, we demonstrate that CODRA significantly outperforms the state of the art, often by a wide margin. We also show that reranking the k-best parse hypotheses generated by CODRA can potentially improve the accuracy even further.
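
The parsing step described in the abstract (an optimal parsing algorithm applied to inferred probabilities) can be illustrated with a generic CKY-style dynamic program over elementary discourse units. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `best_tree` and `toy_score` are hypothetical, the probabilities are invented for illustration, and in CODRA the scores would come from the trained Conditional Random Fields rather than a lookup table.

```python
from typing import Callable, Dict, Tuple

# Hypothetical scoring interface (an assumption, not CODRA's API):
# score(i, k, j) returns {relation: probability} for joining the
# adjacent spans of units i..k and k+1..j into one constituent.
Score = Callable[[int, int, int], Dict[str, float]]


def best_tree(n: int, score: Score) -> Tuple[float, object]:
    """Return (probability, tree) of the most probable binary tree
    over n discourse units, using CKY-style dynamic programming."""
    best = {}  # (i, j) -> (probability, tree) for the span i..j
    for i in range(n):
        best[(i, i)] = (1.0, i)  # a single unit is a leaf
    for length in range(2, n + 1):          # span length in units
        for i in range(0, n - length + 1):  # span start
            j = i + length - 1              # span end
            candidates = []
            for k in range(i, j):           # split point
                left_p, left_t = best[(i, k)]
                right_p, right_t = best[(k + 1, j)]
                for rel, p in score(i, k, j).items():
                    candidates.append(
                        (left_p * right_p * p, (rel, left_t, right_t))
                    )
            best[(i, j)] = max(candidates, key=lambda c: c[0])
    return best[(0, n - 1)]


def toy_score(i: int, k: int, j: int) -> Dict[str, float]:
    # Invented numbers for three units; purely illustrative.
    table = {
        (0, 0, 1): {"Elaboration": 0.8, "Contrast": 0.2},
        (1, 1, 2): {"Elaboration": 0.3, "Contrast": 0.1},
        (0, 0, 2): {"Attribution": 0.5},
        (0, 1, 2): {"Attribution": 0.6},
    }
    return table[(i, k, j)]


prob, tree = best_tree(3, toy_score)
print(prob, tree)
```

With these toy scores, the parser prefers joining units 0 and 1 first and then attaching unit 2. For longer texts, summing log-probabilities instead of multiplying raw probabilities would avoid numerical underflow.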

Similar resources

Rhetorical Move Analysis in Accounting Textbooks

Textbooks, considered one of the most important genres of academic writing, would guarantee effective teaching and learning in EFL/ESL courses. Since textbooks are gaining more and more importance, knowledge of their rhetorical organization, which lets learners and teachers use their content efficiently, seems necessary. This article describes the rhetorical organization of the tex...

A Rhetorical Move Analysis of TEFL Thesis Abstracts: The Case of Allameh Tabataba’i University

The abstract of every research paper has always functioned as an attention-grabber that can either encourage readers to keep reading the research or dissuade them from doing so. Although abstracts are believed to play an important role in distributing research findings, few studies have evaluated the rhetorical organization of thesis abstracts, especially in the field of Teaching English as a Fo...

Translation of Power and Solidarity Pronouns in Qur’anic Rhetoric

Translation of the Holy Quran can be difficult for translators in terms of accuracy and translatability. Sometimes translators fail to render the Quranic thoughts because of the lack of language features in target languages. This results in an unfavorable interpretation. One of the challenging aspects of translating the Quran is reference switching as rhetorical devices, which are widespread i...

Minimum Conditional Entropy Clustering: A Discriminative Framework for Clustering

In this paper, we introduce an assumption which makes it possible to extend the learning ability of discriminative models to the unsupervised setting. We propose an information-theoretic framework as an implementation of the low-density separation assumption. The proposed framework provides a unified perspective of Maximum Margin Clustering (MMC), Discriminative k-means, Spectral Clustering and Unsu...

A Novel Discriminative Framework for Sentence-Level Discourse Analysis

We propose a complete probabilistic discriminative framework for performing sentence-level discourse analysis. Our framework comprises a discourse segmenter, based on a binary classifier, and a discourse parser, which applies an optimal CKY-like parsing algorithm to probabilities inferred from a Dynamic Conditional Random Field. We show on two corpora that our approach outperforms the state-of-t...

Journal:
  • Computational Linguistics

Volume 41

Publication date: 2015